Financial Contributions to 2016 Presidential Campaigns in Florida by Noha Alawwad

Abstract

This is an exploration of donations to the 2016 US presidential campaigns in the state of Florida. The data was collected from the Federal Election Commission website. The dataset contains financial contribution transactions from October 1, 2013 to December 31, 2016. I chose Florida in particular because it is considered a “swing state”.

Note: The term swing state refers to any state that could reasonably be won by either the Democratic or Republican presidential candidate. These states are usually targeted by both major-party campaigns, especially in competitive elections.

Data Loading & Summary

This dataset contains 426057 contributions and 18 variables, summarized as follows:
  • cmte_id: committee id
  • cand_id: candidate id
  • cand_nm: candidate name
  • contbr_nm: contributor name
  • contbr_city: contributor city
  • contbr_st: contributor state
  • contbr_zip: contributor zip code
  • contbr_employer: contributor employer
  • contbr_occupation: contributor occupation
  • contb_receipt_amt: contribution receipt amount
  • contb_receipt_dt: contribution receipt date
  • receipt_desc: receipt description
  • memo_cd: memo code
  • memo_text: memo text
  • form_tp: form type
  • file_num: file number
  • tran_id: transaction id
  • election_tp: election type/primary general indicator
## 'data.frame':    426057 obs. of  18 variables:
##  $ cmte_id          : chr  "C00580100" "C00580100" "C00575795" "C00577130" ...
##  $ cand_id          : Factor w/ 25 levels "P00003392","P20002671",..: 23 23 1 12 1 23 23 23 23 1 ...
##  $ cand_nm          : Factor w/ 25 levels "Bush, Jeb","Carson, Benjamin S.",..: 23 23 4 20 4 23 23 23 23 4 ...
##  $ contbr_nm        : Factor w/ 113981 levels "& GRACE DRISCOLL, CORY",..: 92995 93000 91255 59125 17739 82858 82865 87944 82876 73603 ...
##  $ contbr_city      : Factor w/ 1869 levels ""," PEMBROKE PINES",..: 264 1749 1324 601 1673 145 497 362 1578 1839 ...
##  $ contbr_st        : Factor w/ 1 level "FL": 1 1 1 1 1 1 1 1 1 1 ...
##  $ contbr_zip       : num  3.38e+04 3.30e+04 3.33e+08 3.20e+08 3.23e+08 ...
##  $ contbr_employer  : Factor w/ 30093 levels "","#NAME?","'SHELL LUMBER",..: 22308 21451 26353 19057 25863 20394 22308 13442 20480 18657 ...
##  $ contbr_occupation: Factor w/ 12516 levels ""," RETIRED AGRICULTURE INSPECTOR",..: 9653 1077 2639 7352 4431 6066 9653 5536 927 238 ...
##  $ contb_receipt_amt: num  68.4 80 15 50 100 ...
##  $ contb_receipt_dt : Factor w/ 723 levels "01-APR-15","01-APR-16",..: 204 437 491 106 117 29 418 602 709 2 ...
##  $ receipt_desc     : Factor w/ 46 levels ""," SEE REATTRIBUTION",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ memo_cd          : Factor w/ 2 levels "","X": 2 2 2 1 2 2 2 2 2 2 ...
##  $ memo_text        : Factor w/ 209 levels ""," SEE REATTRIBUTION",..: 1 1 21 11 21 1 1 1 1 21 ...
##  $ form_tp          : Factor w/ 3 levels "SA17A","SA18",..: 2 2 2 1 2 2 2 2 2 2 ...
##  $ file_num         : int  1146165 1146165 1091718 1077404 1091718 1146165 1146165 1146165 1146165 1091718 ...
##  $ tran_id          : Factor w/ 424363 levels "A001647F906B94BC7AAD",..: 290615 279692 126484 372156 125779 286015 276990 340354 301688 125579 ...
##  $ election_tp      : Factor w/ 5 levels "","G2016","O2016",..: 2 2 4 4 4 2 2 2 2 4 ...

Creating New Variables

To start, I need to create the following new variables:
  • Party: candidate’s political party affiliation.
  • contbr_first_nm: contributor’s first name, parsed from contbr_nm variable, used for predicting gender.
  • gender: contributor’s gender.
  • date: contribution date
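
The code that builds these variables is not shown here, so the following is only a minimal base-R sketch of how three of them could be derived. It assumes that contbr_nm follows a "LAST, FIRST MIDDLE" pattern, that contb_receipt_dt uses the DD-MON-YY format visible in the str() output, and that month abbreviations are parsed in an English locale. The toy data frame and the democrats vector are illustrative. Gender prediction is left out because the method used for it is not stated.

```r
# Toy rows mimicking the FEC columns (illustrative values, not real records)
toy <- data.frame(
  cand_nm          = c("Clinton, Hillary Rodham", "Trump, Donald J.", "Sanders, Bernard"),
  contbr_nm        = c("DOE, JANE A.", "SMITH, JOHN", "ROE, MARY ANN"),
  contb_receipt_dt = c("01-APR-16", "15-NOV-15", "31-DEC-16"),
  stringsAsFactors = FALSE
)

# date: parse DD-MON-YY strings into Date objects
toy$date <- as.Date(toy$contb_receipt_dt, format = "%d-%b-%y")

# contbr_first_nm: the first word after the comma (assumed name layout)
toy$contbr_first_nm <- sub("^[^,]*,\\s*(\\S+).*$", "\\1", toy$contbr_nm)

# party: two-level label; the five Democratic candidates are listed by hand,
# everyone else falls into the second group
democrats <- c("Clinton, Hillary Rodham", "Sanders, Bernard",
               "O'Malley, Martin Joseph", "Lessig, Lawrence",
               "Webb, James Henry Jr.")
toy$party <- ifelse(toy$cand_nm %in% democrats, "democrat", "republican")

print(toy[, c("date", "contbr_first_nm", "party")])
```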

Univariate Analysis Section

1. Contribution Amount


First, I want to take a quick look at how the contribution amounts are distributed.

##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -20000.0     15.0     30.0    134.8    100.0  20000.0

From the previous histogram and the summary of contributions, I realized that the data is widely dispersed, with extremely high and extremely low values. I also see that it contains negative values.

Transforming the data could help reveal patterns more clearly, but before transforming I want to remove the negative values in the contb_receipt_amt variable, which could be wrong entries or refunded amounts. I also cap the same variable at $2,700, the individual limit stated in Contribution Limits for 2015-2016.

Data Cleaning

# Removing non-positive values & capping the amount at 2700
library(dplyr)
FL <- filter(FL, contb_receipt_amt > 0, contb_receipt_amt <= 2700)

Data Transformation

# Transforming data using log10
library(ggplot2)
ggplot(aes(x = contb_receipt_amt), data = subset(FL, !is.na(contb_receipt_amt))) +
  geom_histogram(binwidth = 0.05, color = I('black'),   # bin width in log10 units
                 fill = I("darkslateblue")) +
  scale_x_log10()

summary(FL$contb_receipt_amt)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.15   19.00   35.00  143.30  100.00 2700.00


After this preprocessing, I can see from the histogram and summary that most donors contributed small amounts and only a few made large donations.

## 
##     5    10   100    50    25 
## 23044 33032 41674 42740 56461


Sorting the donation amounts by frequency revealed the most frequent ones: 25, 50, 100, 10, and 5 dollars, in decreasing order of frequency.
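
The code behind the frequency table is not shown. One way to get such a ranking in base R is to tabulate the amounts and sort the counts. This is a sketch on a made-up amounts vector; in the analysis the input would be FL$contb_receipt_amt.

```r
# Made-up contribution amounts (illustrative only)
amounts <- c(25, 50, 25, 100, 10, 25, 50, 5, 100, 25, 50)

# Count how often each amount occurs, most frequent first
freq <- sort(table(amounts), decreasing = TRUE)

# Keep the three most frequent amounts
top3 <- head(freq, 3)
print(top3)
```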

2. Date


Contributions usually vary over time and increase as election day approaches. It is therefore useful to plot the number of contributions over the study period and find the dates with the most contributions:

library(scales)  # date_breaks(), date_format()

ggplot(aes(x = date), data = FL) +
  geom_histogram(binwidth = 30, color = I("black"),   # 30-day bins
                 fill = I("darkslateblue")) +
  xlab('Date') +
  ylab('Number of Contributions') +
  ggtitle('Histogram of Contribution Date') +
  scale_x_date(breaks = date_breaks("6 month"),
               labels = date_format("%Y-%m")) +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))


The number of contributions clearly increased in 2016, especially from March onward. The closer the date to election day, the higher the number of contributions.
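
To back the visual impression with numbers, contributions can also be tallied per calendar month. This is a small base-R sketch on made-up dates; the dates vector is illustrative and would be FL$date in the analysis.

```r
# Made-up contribution dates (illustrative only)
dates <- as.Date(c("2015-06-03", "2016-03-01", "2016-03-15",
                   "2016-10-20", "2016-10-21", "2016-10-30"))

# Number of contributions per year-month
monthly <- table(format(dates, "%Y-%m"))

# Month with the most contributions
busiest <- names(monthly)[which.max(monthly)]
print(monthly)
print(busiest)
```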

3. Occupation


To find out which professions contributed most often, the top 10 occupations by number of contributions were selected and examined in the following plot.

occupation_group = FL %>%
  filter(contbr_occupation != '',
         contbr_occupation != 'NA',
         contbr_occupation != 'None',
         contbr_occupation != 'INFORMATION REQUESTED',
         contbr_occupation != 'INFORMATION REQUESTED PER BEST EFFORTS') %>%
  group_by(contbr_occupation) %>%
  summarise(total_donation = sum(contb_receipt_amt),
            donors_num = n(),
            avg_donation = total_donation/donors_num) %>%
  arrange(desc(donors_num)) %>%
  top_n(10, donors_num)


occupation_group$contbr_occupation = factor(occupation_group$contbr_occupation, levels = unique(occupation_group$contbr_occupation)[order(occupation_group$donors_num, decreasing = TRUE)])

occupation_group
## # A tibble: 10 x 4
##    contbr_occupation total_donation donors_num avg_donation
##    <fct>                      <dbl>      <int>        <dbl>
##  1 RETIRED                15619777.     130622        120. 
##  2 NOT EMPLOYED            1469443.      32415         45.3
##  3 ATTORNEY                3376859        9811        344. 
##  4 TEACHER                  325967.       5848         55.7
##  5 PHYSICIAN               1209061.       5571        217. 
##  6 HOMEMAKER               2051411.       5466        375. 
##  7 CONSULTANT               870123.       3984        218. 
##  8 SALES                    390145.       3972         98.2
##  9 PROFESSOR                291717.       3696         78.9
## 10 ENGINEER                 309171.       3093        100.0
ggplot(aes(x = contbr_occupation, y = donors_num ), data = occupation_group) +
  geom_bar(stat = 'identity', color = I('black'), fill=I("darkslateblue") ) +
  geom_text(stat='identity', aes(label=donors_num), data = occupation_group, vjust = -0.5, size=3) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  ylab("Number of Contributors") + 
  xlab("Top 10 Occupations") +
  ggtitle("Contributors Count of Top 10 Occupations")

Retired people take first place in the number of contributions, followed by people who are not employed, with attorneys in third place.

4. Party

sum(FL$contb_receipt_amt)
## [1] 58150998
DFparty = FL %>%
  group_by(party) %>%
  summarise(total_donation = sum(contb_receipt_amt),
            candidate_number = n_distinct(cand_id),
            contr_n = n(),
            avg_amount = round(total_donation / contr_n)) %>%
  arrange(desc(contr_n))

DFparty
## # A tibble: 2 x 5
##   party      total_donation candidate_number contr_n avg_amount
##   <chr>               <dbl>            <int>   <int>      <dbl>
## 1 democrat        25019773.                5  256825         97
## 2 republican      33131224.               20  149069        222
ggplot(aes(x = party, fill = party), data = FL) +
  geom_bar() +   # party is discrete, so rows are counted with a bar chart
  xlab('Party') +
  ylab('Number of Contributions') +
  ggtitle('Number of Contributions by Party') +
  scale_fill_manual(values = c("#084594", "#b30000"))


The five Democratic candidates received more contributions in total than the Republican group, even though that group has twenty candidates. Democratic candidates received sixty-three percent of all contributions by count. Note that the party variable has only two levels, so the republican group also includes the third-party and independent candidates (Gary Johnson, Jill Stein, and Evan McMullin).
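
The 63% figure can be reproduced from the contr_n column of DFparty. This is a quick check, with the two counts typed in by hand from the table above:

```r
# Contribution counts per party, copied from the DFparty output
contr_n <- c(democrat = 256825, republican = 149069)

# Share of all contributions (by count) going to each party, in percent
share <- round(100 * contr_n / sum(contr_n), 1)
print(share)
```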


5. Candidates

table(FL$cand_nm)
## 
##                 Bush, Jeb       Carson, Benjamin S. 
##                      5407                     15505 
##  Christie, Christopher J.   Clinton, Hillary Rodham 
##                       202                    176922 
## Cruz, Rafael Edward 'Ted'            Fiorina, Carly 
##                     27487                      2003 
##      Gilmore, James S III        Graham, Lindsey O. 
##                         1                       162 
##            Huckabee, Mike             Jindal, Bobby 
##                       397                        22 
##             Johnson, Gary           Kasich, John R. 
##                       800                      1303 
##          Lessig, Lawrence            McMullin, Evan 
##                        38                        93 
##   O'Malley, Martin Joseph         Pataki, George E. 
##                       147                        19 
##                Paul, Rand    Perry, James R. (Rick) 
##                      1952                        33 
##              Rubio, Marco          Sanders, Bernard 
##                     18076                     79686 
##      Santorum, Richard J.               Stein, Jill 
##                        84                       463 
##          Trump, Donald J.             Walker, Scott 
##                     74718                       342 
##     Webb, James Henry Jr. 
##                        32
ggplot(aes(x = cand_nm, fill= party), data = FL) + 
  geom_bar() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  xlab('Candidate') +
  ylab('Number of Contributors') +
  ggtitle('Number of Contributors per Candidate') +
  scale_fill_manual(values = c("#084594", "#b30000"))

Hillary Clinton led the 25 candidates in the number of contributions, followed by Bernard Sanders and then Donald Trump. Within each party, the majority of contributions went to only a few candidates. For the Democratic party, Hillary Clinton and Bernard Sanders together account for 99.9% of all contributions, and Clinton alone for 68.9%. For the Republican party, Donald Trump received 50% of all contributions. Donald Trump, Ted Cruz, Marco Rubio, Ben Carson, and Jeb Bush together received 94.7% of the contributions on the Republican side; the remaining 5.3% was shared by the other 15 candidates.
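
These shares follow directly from the table(FL$cand_nm) counts. The sketch below recomputes them, with the counts copied by hand from the output above and the vector names used only as shorthand labels:

```r
# Contribution counts copied from table(FL$cand_nm)
dem      <- c(Clinton = 176922, Sanders = 79686, OMalley = 147, Lessig = 38, Webb = 32)
rep_top5 <- c(Trump = 74718, Cruz = 27487, Rubio = 18076, Carson = 15505, Bush = 5407)
rep_total <- 149069  # contr_n for the republican group in DFparty

# Democratic side: Clinton + Sanders together, and Clinton alone
dem_top2_share <- round(100 * sum(dem[c("Clinton", "Sanders")]) / sum(dem), 1)
clinton_share  <- round(100 * dem["Clinton"] / sum(dem), 1)

# Republican side: Trump alone, and the five largest recipients together
trump_share    <- round(100 * rep_top5["Trump"] / rep_total, 1)
rep_top5_share <- round(100 * sum(rep_top5) / rep_total, 1)

print(c(dem_top2_share, clinton_share, trump_share, rep_top5_share))
```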


6. Gender

## # A tibble: 2 x 4
##   gender   sum_gen  n_gen avg_gen
##   <chr>      <dbl>  <int>   <dbl>
## 1 female 22675813. 199365    114.
## 2 male   35475185. 206529    172.
ggplot(aes(x = gender,y=n_gen,fill = gender), data = contr_by_gen) + 
  geom_bar(stat='identity', width = 0.8) + 
  geom_text(stat='identity',aes(label=n_gen), data = contr_by_gen, vjust = -0.5, size=3) +
  ylab("Contributors Count") + 
  xlab("Gender") + 
  ggtitle("Contributors Count by Gender") + 
  scale_fill_manual(values = c("#e78ac3", "#66c2a5"))


The bar plot shows that both genders contributed heavily to the presidential campaigns. The difference is slight: men made 3.6% more contributions than women.


Bivariate Analysis Section

1. Contribution Amount and Parties

ggplot(aes(x = contb_receipt_amt, fill = party), 
       data = FL) + 
  geom_histogram(position = position_dodge(), color=I('black'), binwidth = 90) +
  ylab("Contribution Count") + 
  xlab("Contribution Amount") +
  ggtitle("Histogram of Contribution Amount by Party") + 
  scale_fill_manual(values = c("#084594", "#b30000"))


The previous histogram shows that contributions made in Florida are dominated by small donations. This result is expected, since our study covers only individual supporters.

ggplot(aes(x = party, y = contb_receipt_amt, fill = party),
       data = FL) + 
  geom_boxplot() + 
  ylab("Contribution Amount") + 
  xlab("Party") +
  ggtitle("Boxplot of Contribution Amount by Party") + 
  scale_fill_manual(values = c("#084594", "#b30000"))


The boxplot shows a large number of outliers, which makes the data hard to explore. Transforming the data using log10 helped me see the distribution more clearly.

ggplot(aes(x = party, y = contb_receipt_amt, fill = party),
       data = FL) + 
  geom_boxplot() + 
  scale_y_log10() + 
  ylab("Log10 Contribution Amount") + 
  xlab("Party") +
  ggtitle("Boxplot of Contribution Amount by Party") + 
  scale_fill_manual(values = c("#084594", "#b30000"))


Now I can see that donations to Democrats are more dispersed than donations to Republicans: they include both very high and very low amounts.
I was curious to know which party received the larger total amount, because Florida is a swing state and there is no obvious clue as to which party will win.


p1 <- ggplot(aes(x = party, y = total_donation/1000, fill = party), data = DFparty) + 
  geom_bar(stat='identity') + 
  geom_text(stat='identity', aes(label = round(total_donation/1000)), data = DFparty, vjust = -0.3) +
  ylab("Total Contributions (thousands USD)") + 
  xlab("Party") + 
  ggtitle("Total Contributions Amount\nby Party") + 
  scale_fill_manual(values = c("#084594", "#b30000"))

p2 <- ggplot(aes(x = party, y = avg_amount, fill = party), data = DFparty) + 
  geom_bar(stat='identity') + 
  geom_text(stat='identity', aes(label = avg_amount), data = DFparty, vjust = -0.3) +
  ylab("Average Contributions (USD)") + 
  xlab("Party") + 
  ggtitle("Average Contributions Amount\nby Party") + 
  scale_fill_manual(values = c("#084594", "#b30000"))

grid.arrange(p1, p2, ncol = 2)


The left bar plot shows that the Republican total is 8.1 million dollars higher than the Democratic total; Republican donations reached 33.1 million dollars. Taking the average donation for each party, as shown in the second bar plot, we can tell that the average Democratic donation is also lower than the Republican one, by almost 125 dollars. This makes sense: Democrats received 63% of all contributions by count but the smaller total amount, so their average contribution must be lower.
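
The two gaps quoted here can be checked with a few lines of arithmetic. The values below are typed in from the printed DFparty table, so they are rounded to the precision shown there:

```r
# Totals and counts per party, copied from the DFparty output
total_donation <- c(democrat = 25019773, republican = 33131224)
contr_n        <- c(democrat = 256825,   republican = 149069)

# Gap between the party totals, in millions of dollars
gap_millions <- round((total_donation["republican"] - total_donation["democrat"]) / 1e6, 1)

# Average contribution per party and the gap between the averages
avg_amount <- total_donation / contr_n
avg_gap    <- round(avg_amount["republican"] - avg_amount["democrat"])

print(round(avg_amount))
print(c(gap_millions, avg_gap))
```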

2. Contribution Amount and Candidates

This time I want to dig down to the candidate level and find out which candidate received the highest donation amount.


## # A tibble: 25 x 4
## # Groups:   party [2]
##    party      cand_nm                 total_donation donor_num
##    <chr>      <fct>                            <dbl>     <int>
##  1 republican Gilmore, James S III              500          1
##  2 democrat   Webb, James Henry Jr.           14400         32
##  3 democrat   Lessig, Lawrence                16058.        38
##  4 republican McMullin, Evan                  20792.        93
##  5 republican Pataki, George E.               21300         19
##  6 republican Jindal, Bobby                   29350         22
##  7 republican Santorum, Richard J.            64639.        84
##  8 republican Perry, James R. (Rick)          74200         33
##  9 republican Stein, Jill                     96072.       463
## 10 democrat   O'Malley, Martin Joseph        152823.       147
## # ... with 15 more rows


From the horizontal bar chart, we can tell that more than half of the donation amount in Florida went to Hillary Clinton and Donald Trump, who received 21.48 and 12.52 million dollars, respectively. Clinton is the leading Democratic candidate in Florida, with a donation amount more than six times that of Bernard Sanders, who belongs to the same party. On the Republican side, Trump received the most. Marco Rubio and Jeb Bush came in the second tier with more than 6 million dollars each.
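
The code that produced the per-candidate table is not shown. A table with the same columns can be built as in the sketch below, which uses base R aggregate() on a toy data frame rather than the dplyr pipeline used elsewhere in this report; the rows are illustrative, not real records.

```r
# Toy contributions (illustrative values only)
toy <- data.frame(
  party   = c("democrat", "democrat", "republican", "republican", "democrat"),
  cand_nm = c("Clinton, Hillary Rodham", "Sanders, Bernard", "Trump, Donald J.",
              "Trump, Donald J.", "Clinton, Hillary Rodham"),
  contb_receipt_amt = c(100, 27, 250, 50, 35),
  stringsAsFactors = FALSE
)

# Sum and count of contributions per party and candidate
totals <- aggregate(contb_receipt_amt ~ party + cand_nm, data = toy, FUN = sum)
counts <- aggregate(contb_receipt_amt ~ party + cand_nm, data = toy, FUN = length)

# Combine both summaries; the two aggregates share the same row order
by_cand <- data.frame(party          = totals$party,
                      cand_nm        = totals$cand_nm,
                      total_donation = totals$contb_receipt_amt,
                      donor_num      = counts$contb_receipt_amt)

# Sort by total amount, smallest first, as in the printed table
by_cand <- by_cand[order(by_cand$total_donation), ]
print(by_cand)
```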


3. Contribution Amount by Gender

p1 <- ggplot(aes(x = gender,y=sum_gen,fill = gender), data = contr_by_gen) + 
  geom_bar(stat='identity', width = 0.8) + 
  geom_text(stat='identity',aes(label=round(sum_gen/1000)), data = contr_by_gen, vjust = -0.5, size=3) +
  ylab("Total Contribution (thousands USD)") + 
  xlab("Gender") + 
  ggtitle("Total Contribution Amount\nby Gender") + 
  theme(plot.title = element_text(size = 12))+
  scale_fill_manual(values = c("#e78ac3", "#66c2a5"))

p2 <- ggplot(aes(x = gender,y=avg_gen,fill = gender), data = contr_by_gen) + 
  geom_bar(stat='identity', width = 0.8) + 
  geom_text(stat='identity',aes(label=round(avg_gen)), data = contr_by_gen, vjust = -0.5, size=3) +
  ylab("Average Contribution (USD)") + 
  xlab("Gender") + 
  ggtitle("Average Contribution Amount\nby Gender") + 
  theme(plot.title = element_text(size = 12))+
  scale_fill_manual(values = c("#e78ac3", "#66c2a5"))
grid.arrange(p1, p2, ncol=2)


Although the number of contributions is almost the same for males and females, we can see from the two bar plots that both the total and the average donation amounts are smaller for females than for males.

4. Contribution Amount by Occupation

occupation_group = FL %>%
  filter(contbr_occupation != '',
         contbr_occupation != 'NA',
         contbr_occupation != 'None',
         contbr_occupation != 'INFORMATION REQUESTED',
         contbr_occupation != 'INFORMATION REQUESTED PER BEST EFFORTS') %>%
  group_by(contbr_occupation) %>%
  summarise(total_donation = sum(contb_receipt_amt),
            donors_num = n(),
            avg_donation = total_donation/donors_num) %>%
  arrange(desc(avg_donation)) %>%
  top_n(10, donors_num)


occupation_group$contbr_occupation = factor(occupation_group$contbr_occupation, levels = unique(occupation_group$contbr_occupation)[order(occupation_group$avg_donation, decreasing = TRUE)])
ggplot(aes(x = contbr_occupation, y = avg_donation), data = occupation_group) +
  geom_bar(stat = 'identity', color = I('black'), fill=I("darkslateblue") ) +
  geom_text(stat='identity', aes(label= round(avg_donation,2)), data = occupation_group, vjust = -0.5, size=3) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  ylab("Contributions Average Amount") + 
  xlab("Top 10 Occupations") +
  ggtitle("Average Contributions Amount of Top 10 Occupations")

In the previous section, we observed that retired people made the highest number of contributions. However, when we look at the average contribution amount among the same ten occupations, homemakers, most of whom are women, take the leading position and attorneys take second place. People who are not employed contribute the least on average, which is not surprising.

Multivariate Analysis Section

1. Contribution, Candidates, and Gender

candidates <- c("Clinton, Hillary Rodham",
                "Sanders, Bernard", 
                "Rubio, Marco", 
                "Bush, Jeb", 
                "Trump, Donald J.")

candidate_group <- FL %>% 
  filter(cand_nm %in% candidates) %>% 
  group_by(gender, cand_nm) %>% 
  summarise(total_donation = sum(contb_receipt_amt),
            donor_num = n(),
            avg_donation = round(total_donation/donor_num)) %>% 
  arrange(donor_num, gender)


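# Order candidates by their total number of contributions (both genders combined)
cand_totals <- tapply(candidate_group$donor_num, as.character(candidate_group$cand_nm), sum)
candidate_group$cand_nm = factor(candidate_group$cand_nm,
                                 levels = names(sort(cand_totals, decreasing = TRUE)))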

candidate_group
## # A tibble: 10 x 5
## # Groups:   gender [2]
##    gender cand_nm                 total_donation donor_num avg_donation
##    <chr>  <fct>                            <dbl>     <int>        <dbl>
##  1 female Bush, Jeb                     2236166.      1944         1150
##  2 male   Bush, Jeb                     4093789.      3463         1182
##  3 female Rubio, Marco                  2348246.      6694          351
##  4 male   Rubio, Marco                  4295700.     11382          377
##  5 female Trump, Donald J.              3743498.     27728          135
##  6 female Sanders, Bernard              1418184.     36320           39
##  7 male   Sanders, Bernard              1940525.     43366           45
##  8 male   Trump, Donald J.              8771535.     46990          187
##  9 male   Clinton, Hillary Rodham      11193100.     69419          161
## 10 female Clinton, Hillary Rodham      10284683.    107503           96
ggplot(aes(x = cand_nm, y=donor_num, fill = gender), data = candidate_group) + 
  geom_bar(stat='identity', width = 0.9, position = position_dodge()) + 
  coord_flip() +
  ylab("Total Contributors") + 
  xlab("Candidates") + 
  ggtitle("Total Contributors per Candidate\nby Gender") + 
  scale_fill_manual(values = c("#e78ac3", "#66c2a5"))

ggplot(aes(x = cand_nm, y=total_donation/1000, fill = gender), data = candidate_group) + 
  geom_bar(stat='identity', width = 0.9, position = position_dodge()) + 
  coord_flip() +
  ylab("Total Contribution Amount (thousands USD)") + 
  xlab("Candidates") + 
  ggtitle("Total Contribution Amount per Candidate\nby Gender") + 
  scale_fill_manual(values = c("#e78ac3", "#66c2a5"))

ggplot(aes(x = cand_nm, y=avg_donation , fill = gender), data = candidate_group) + 
  geom_bar(stat='identity', width = 0.9, position = position_dodge()) + 
  coord_flip() +
  ylab("Average Contribution Amount (USD)") + 
  xlab("Candidates") + 
  ggtitle("Average Contribution Amount per Candidate\nby Gender") + 
  scale_fill_manual(values = c("#e78ac3", "#66c2a5"))


Although we observed in the first section that males made slightly more contributions than females, among the five selected candidates women contributed more often than men to Hillary Clinton. This makes sense, as she is the dominant female candidate. In contrast, looking at the total and average donation amounts per candidate, men always gave more than women.


2. Contribution, Party, and Gender

party_gender_group <- FL %>%
  group_by(gender, party) %>%
  summarise(total_donation = sum(contb_receipt_amt),
            donor_num = n(),
            avg_donation = total_donation/donor_num) %>%
  arrange(total_donation)
party_gender_group
## # A tibble: 4 x 5
## # Groups:   gender [2]
##   gender party      total_donation donor_num avg_donation
##   <chr>  <chr>               <dbl>     <int>        <dbl>
## 1 female republican      10933149.     55494        197. 
## 2 female democrat        11742664.    143871         81.6
## 3 male   democrat        13277109.    112954        118. 
## 4 male   republican      22198076.     93575        237.
ggplot(aes(x = gender,
           y=donor_num/1000,
           fill = party),
       data = party_gender_group) + 
  geom_bar(stat='identity',
           width = 0.9,
           position = position_dodge()) + 
  geom_text(stat='identity',
            aes(label=round(donor_num/1000)),
            data = party_gender_group,
            position = position_dodge(0.9),
            vjust = -0.04,
            size=3)+
  ylab("Total Contributors (thousands)") + 
  xlab("Gender") + 
  ggtitle("Total Contributors by Gender and Party") + 
  theme(plot.title = element_text(size = 12),
        axis.text = element_text(size = 12))+
  scale_fill_manual(values = c("#084594", "#b30000"))

ggplot(aes(x = gender,
           y = total_donation/1000,
           fill = party),
       data = party_gender_group) + 
  geom_bar(stat='identity',
           width = 0.9,
           position = position_dodge()) + 
  geom_text(stat='identity',
            aes(label=round(total_donation/1000)),
            data = party_gender_group,
            position = position_dodge(0.9),
            vjust = -0.04,
            size=3)+
  ylab("Total Contribution Amount (thousands USD)") + 
  xlab("Gender") + 
  ggtitle("Total Contribution Amount by Gender and Party") + 
  scale_fill_manual(values = c("#084594", "#b30000"))


The first bar plot shows that most contributions to Democrats come from females (about 144 thousand versus 113 thousand from males), while for the Republican party contributions from males outnumber those from females by a factor of about 1.7. The second plot shows that male Republican donors gave about 1.7 times the amount given by male Democratic donors (22.2 versus 13.3 million dollars). However, there is no big difference between the amounts donated by females to the two parties.

3. Maps

library(zipcode)  # zipcode data set and clean.zipcodes()
data(zipcode)
FL$contbr_zip <- clean.zipcodes(FL$contbr_zip)
FL <- merge(FL, zipcode, by.x = 'contbr_zip', by.y = 'zip')
FL <- filter(FL, state == 'FL')
map_FL = map_data('county', 'florida')  # requires the maps package
ggplot(FL, aes(longitude, latitude)) +
  geom_polygon(data = map_FL, aes(x = long, y = lat, group = group),
               color = 'gray', fill = NA, alpha = .35) +
  geom_point(aes(colour = party), size = 1.7, alpha = .6) +
  ggtitle("Geographical Location of Donors by Party") +
  scale_colour_manual(name = "",
                      values = c("#084594", "#b30000"))


Republican contributors appear to be concentrated in the lower part of Florida, around the Miami, St. Petersburg, and Fort Myers areas. This makes sense, as these are among the major population centers in Florida. Still, red points cover much of the map, even though Florida is a swing state.


## clinton map
clinton_zip_Avg <- FL %>% 
  filter(cand_nm == 'Clinton, Hillary Rodham') %>%
  group_by(contbr_zip) %>%
  summarise(value = round(mean(contb_receipt_amt)))

clinton_zip_Avg$region <- as.character(clinton_zip_Avg$contbr_zip)
clinton_zip_Avg<- na.omit(clinton_zip_Avg)
keeps <- c("region", "value")
clinton_zip_Avg <- clinton_zip_Avg[keeps]

clinton_zip_Avg_map <- ZipChoropleth$new(clinton_zip_Avg)
clinton_zip_Avg_map$ggplot_scale = scale_fill_brewer(palette = 1, drop = TRUE)
clinton_zip_Avg_map$set_zoom_zip(state_zoom = "florida",
                             county_zoom = NULL,
                             msa_zoom = NULL,
                             zip_zoom = NULL)
clinton_zip_Avg_map$title = "2016 Florida Average Donation for Hillary Clinton (USD)\nby County"

## trump map
trump_zip_avg <- FL %>% 
  filter(cand_nm == 'Trump, Donald J.') %>%
  group_by(contbr_zip) %>%
  summarise(value = round(mean(contb_receipt_amt))) 

trump_zip_avg$region <- as.character(trump_zip_avg$contbr_zip)
trump_zip_avg<- na.omit(trump_zip_avg)
keeps <- c("region", "value")
trump_zip_avg <- trump_zip_avg[keeps]

trump_zip_avg_map <- ZipChoropleth$new(trump_zip_avg)
trump_zip_avg_map$ggplot_scale = scale_fill_brewer(palette = 8, drop = FALSE)
trump_zip_avg_map$set_zoom_zip(state_zoom = "florida",
                             county_zoom = NULL,
                             msa_zoom = NULL,
                             zip_zoom = NULL)
trump_zip_avg_map$title = "2016 Florida Average Donation for Donald Trump (USD)\nby County"

##
p1 <- clinton_zip_Avg_map$render()
p2 <- trump_zip_avg_map$render()

grid.arrange(p1, p2, ncol = 1)



We have seen that Clinton leads in both the number of contributions and the total contribution amount. However, her supporters are located in only a few counties. Trump supporters, on the other hand, are spread over the whole of Florida. The dark spot in the upper part of Trump’s map suggests that the highest average contribution amount came from the Tallahassee area, the state capital of Florida.


Final Plots and Summary

Retired People Are at the Top of the Contributors List


Retired individuals were the top contributors by count. This is related to the large number of retired people in Florida, which is considered one of the best states for retirees for reasons such as the low cost of living and the absence of a state income tax.



Females Tend to Donate More to Female Candidates


As discussed previously, Florida women contribute to Democratic candidates more often than men do. For the female candidate Hillary Clinton, we observed that women contributed far more often than men, whereas for the male candidates men contributed more often than women.



Trump Supporters Are Distributed over Most Florida Counties


Trump supporters are located in almost every part of Florida, unlike Clinton supporters, who are located in only a few counties of the state.
The dark spots in the maps reflect high average donation amounts per contribution in the area. For Trump, the darkest spot in the north indicates that Tallahassee had the highest average contribution amount, which makes sense as it is the state capital of Florida.


Conclusion

Florida is known as a swing state with tough presidential competition, and it carries massive electoral significance. Florida played a major role in the 2016 election and was therefore selected for our study. The final result in Florida reflected this close competition: Trump won with 4,605,515 votes (49.1%), and Hillary Clinton came second with 4,485,745 votes (47.8%).



Reflection

Future work could analyze presidential campaign data for Florida over the past 40 years to see how people’s party preferences have shifted, and analyze campaign finance data for all 50 states to check whether the insights from Florida hold true at the national level.

Finally, a list of some findings in this project:

Issues

I spent a long time struggling with the manipulation of date values, a data type I am not familiar with.

Resources